Pruning closed itemset lattices for association rules

Authors

  • Nicolas Pasquier
  • Yves Bastide
  • Rafik Taouil
  • Lotfi Lakhal
Abstract

Discovering association rules is one of the most important tasks in data mining, and many efficient algorithms have been proposed in the literature. The most notable are Apriori, Mannila's algorithm, Partition, Sampling and DIC, all of which are based on the Apriori mining method: pruning the subset lattice (itemset lattice). In this paper we propose an efficient algorithm, called Close, based on a new mining method: pruning the closed set lattice (closed itemset lattice). This lattice, which is a sub-order of the subset lattice, is closely related to Wille's concept lattice in formal concept analysis. Experiments comparing Close to an optimized version of Apriori showed that Close is very efficient for mining dense and/or correlated data such as census data (a difficult case), and performs reasonably well for market-basket-style data.
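To make the notion of a closed itemset concrete, here is a minimal Python sketch. It is not the Close algorithm from the paper. It is a brute-force illustration that assumes the usual definition of the closure of an itemset as the intersection of all transactions containing it, with an itemset being closed when it equals its own closure. The toy transactions, the function names and the absolute support threshold `min_count` are all hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
transactions = [
    frozenset("ACD"),
    frozenset("BCE"),
    frozenset("ABCE"),
    frozenset("BE"),
    frozenset("ABCE"),
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions)


def closure(itemset, transactions):
    """Intersection of all transactions that contain `itemset`."""
    itemset = frozenset(itemset)
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        # Assumed convention: an itemset found in no transaction
        # closes to the set of all items.
        return frozenset().union(*transactions)
    return frozenset.intersection(*covering)


def frequent_closed_itemsets(transactions, min_count):
    """Brute-force enumeration (exponential, for illustration only).

    Returns a dict mapping each frequent closed itemset to its support count.
    """
    items = sorted(frozenset().union(*transactions))
    closed = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = support_count(candidate, transactions)
            if count >= min_count:
                # An itemset and its closure have the same support, so
                # many frequent itemsets collapse onto one closed itemset.
                closed[closure(candidate, transactions)] = count
    return closed


closed = frequent_closed_itemsets(transactions, min_count=2)
for itemset, count in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(itemset)), count)
```

On this toy data, many frequent itemsets share the same closure, so the set of frequent closed itemsets is much smaller than the set of all frequent itemsets. That size difference is what makes the closed itemset lattice attractive as a search space for dense or correlated data.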

Similar resources

Further Pruning for Efficient Association Rule Discovery

The Apriori algorithm’s frequent itemset approach has become the standard approach to discovering association rules. However, the computation requirements of the frequent itemset approach are infeasible for dense data and the approach is unable to discover infrequent associations. OPUS AR is an efficient algorithm for association rule discovery that does not utilize frequent itemsets and hence ...


Traversing Itemset Lattices with Statistical Metric Pruning

We study how to efficiently compute significant association rules according to common statistical measures such as a chi-squared value or correlation coefficient. For this purpose, one might consider using the Apriori algorithm, but the algorithm needs major conversion, because none of these statistical metrics are anti-monotone, and the use of higher support for reducing the search spa...


A lattice-based approach for mining most generalization association rules

Traditional association rules consist of some redundant information. Some variants based on support and confidence measures such as non-redundant rules and minimal non-redundant rules were thus proposed to reduce the redundant information. In the past, we proposed most generalization association rules (MGARs), which were more compact than (minimal) non-redundant rules in that they considered th...


Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions

The mining of frequent itemsets is often challenged by the length of the patterns mined and also by the number of transactions considered for the mining process. Another acute challenge that concerns the performance of any association rule mining algorithm is the presence of "null" transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...


Mining Non-Redundant Frequent Pattern in Taxonomy Datasets using Concept Lattices

In general, frequent itemsets are generated from large data sets by applying various association rule mining algorithms, which produce many redundant frequent itemsets. In this paper we proposed a new framework for non-redundant frequent itemset generation using closed frequent itemsets, without loss of information, on taxonomy datasets using concept lattices. General Terms Frequent Pattern, Assoc...


Publication date: 1998